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ABSTRACT 

We present a new method of extending the single band Analysis of Vari¬ 
ance period estimation algorithm to mnltiple bands. We nse SDSS Stripe 82 
RR Lyrae to show that in the case of low nnmber of observations per band and 
non-simnltaneons observations, improvements in period recovery rates of np to 
~60% are observed. We also investigate the effect of inter-band observing ca¬ 
dence on period recovery rates. We hnd that nsing non-simnltaneons observation 
times between bands is ideal for the mnltiband method, and nsing simnltaneons 
mnltiband data is only marginally better than using single band data. These 
results will be particularly useful in planning observing cadences for wide-held 
astronomical imaging surveys such as LSST. They also have the potential to im¬ 
prove the extraction of transient data from surveys with few (< 30) observations 
per band across several bands, such as the Dark Energy Survey. 

Subject headings: methods: data analysis — stars: variables: general — stars: 
variables: RR Lyrae — surveys 
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1. Introduction 

The period-luminosity relationship of variable stars, hrst discovered by Henrietta 
Leavitt and calibrated by Ejnar Hertzsprung (Hertzsprung 1913), is an important step in 
the astronomical distance ladder. With applications to measuring the Hubble constant 
(Riess et al. 2011) and mapping out Galactic substructure (Sesar et ah 2010), periodic 
variables are key science drivers for next-generation astronomical imaging surveys such as 
the Large Synoptic Survey Telescope (LSST, Ivezic et al. 2008) and Gaia (Eyer et al. 2015). 
Approximately 50 million variable stars will be detected by LSST (Sesar et al. 2007) and 
18 million variables by Gaia (Eyer & Guypers 2000), therefore, automated classihers must 
be relied upon to hnd the variable sources and determine the period of the source, if it is 
periodic. 

Numerous period Ending algorithms have been implemented over the years (see 
Graham et al. 2013 for a comparison of various algorithms). One common characteristic 
of most modern period finding algorithms is the use of observational data in a single 
band. For current generation transient surveys such as the intermediate Palomar Transient 
Factory (iPTF; 10-5000 observations in R band for certain fields^. Law et al. 2009) and 
Optical Gravitational Lensing Experiment (OGLE; 400-500 observations in I band for LMG 
objects e.g., Soszyhski et al. 2009a,b; Udalski et al. 1992), the volume of data in any one 
band is sufficient to accurately determine the period, rendering the use of additional bands 
redundant. However, in multiband surveys in which only a limited number of observations 
are available in each band, single band algorithms can struggle due to poor phase coverage 
(Graham et al. 2013). 

Multiband period finding methods have been explored before, but the proposed 


^http: //www.ptf.caltech.edu/page/first_data_release 
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methods require either simultaneous measurements (Suveges et ah 2012) or require that a 
period be correctly recovered by a single band algorithm in the majority of bands sampled 
(Oluseyi et ah 2012). The former case puts strict requirements on observing strategy, while 
the latter still suffers from the inability of single band algorithms to return accurate results 
with limited observations. Only recently (Long et ah 2014; VanderPlas & Ivezic 2015) have 
methods been proposed that are general in the sense of allowing arbitrary observation times 
and fully incorporating data from multiple bands into an algorithm. 

In this Letter, we propose a method to extend the Analysis of Variance (AoV, 
Schwarzenberg-Czerny 1996) single band algorithm to multiple bands. The method 
improves period recovery rates for poorly sampled multiband light curves. In addition, we 
discuss the importance of observational cadence between the bands to be used, and show 
that non-simultaneous observations between bands increases the ability of our multiband 
algorithm to recover the correct period. 


2. Data 

In this Letter, we select a sample from the 483 RR Lyrae stars from Sesar et ah (2010), 
and use light curves from the Sloan Digital Sky Survey (SDSS) Stripe 82 Variable Source 
Catalog (Ivezic et al. 2007). These stars have a reasonably large number of observations, 
with a median number of observations per band of 56 across the SDSS g, r, and i bands, 
and 55 in u and z The data span 3340 days. Of the 483 sources found in Sesar et al. 
(2010), 33 were either not found in the Variable Source Catalog, or had <10 observations 
in one or more bands. Table 1 gives a complete description of the number of RR Lyrae as 
a function of number of observations and downsampling method (described in section 3.2). 
The three band sample uses only the g, r, and i bands. It should also be noted that the 
typical time for SDSS to complete one pass through all Liters is ~ 0.004 days (5.7 mins. 
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York et al. 2000). 


3. Method 

3.1. AoV Multiband Extension 

As our model for the variation of brightness with time in a single band, we adopt a 
sinusiod with K harmonics. Assuming rib observations in each of B bands, our data are of 
the form {(4i, rubj, where tti is the time of the f-th observation in band b, mu is 

the measured magnitude at that time, and au is the uncertainty associated with mbi- We 
assume cn, the frequency, is constant across all bands. Our model can be written as 

K 

mu = /So6 + ^ a,bk sin{kutbi + (t)bk) + ^u (1) 

k=l 

K 

= Pob + '^{a-bk cos(0bfc) sm{kutu) + abk sin(06fc) cos{kutbi)) + eu 

k=l 

where eu ~ -^(0, are independent across i and b. This model is equivalent to the 
multiphase Nbase = 0, Nband = K model of VanderPlas & Ivezic (2015). The periodogram 
we construct (see Equation 5, this article) is different than that of VanderPlas & Ivezic 
(2015). See Equation 5 and the discussion in section 5 for more details. Long et al. (2014) 
studies this model with K = 1 and termed it MGLS. The authors did not construct 
periodograms for this model, and did not study the effects of inter-band observing cadence 
on period recovery. 

One natural approach for estimating oo is to use maximum likelihood. Let 
= (ofei,..., ttbK) and a = (ai,... a^). Analogous dehnitions apply for and (p. Let 
/3o = (/doi,... ,(3ob)- Since the error model is normal, maximum likelihood is equivalent 
to Ending the oo which minimizes the weighted sum of squares, sometimes known as 
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‘chi-squared minimization.’' 


. f rribi- Yji^bk cos(06fc) sm{ku:tbi) + a^k sin(0;,fc) cos{kutbi)) - ^ 

u = argmm mm > > - 

^ a,0,/3o ^ V (^bi 

6=1 1=1 ^ 


B 


rib 


= arffmin > min > 

uj ab,4>b,0ob ^ 


'Tbibi - J2{(^bk cos{(j)bk) sm{kutbi) + abk sm{(j)bk) cos{ku}tbi)) - (5ob 


^bi 


b=l " i=l 

We moved the min inside the sum over b because the 6*^ summand only depends on 

^6: ^65 /^06* 

The sum over i can be simplihed by noting the linearity of the model and 
reparameterizing. Let nifc = (mw,..., rribubV- Let (3bki = abk cos{4>bk) and I3bk2 = abk sm{(j)bk)- 
Dehne /3b = (Pob, An, A 12 , • • •, Ai^i, Ak 2 )^ e Let be a n?, x Ub diagonal matrix 


where T^bu = af-. Dehne 

/ 


XJu;) = 


1 sin(ci;4i) cos(a;4i) 

1 sin ( 0742 ) cos{ujtb 2 ) 

sin (uitbub) cos (uitbub) 

We rewrite the ML estimator as 

B 


sin(/La;t6i) cos{Kutbi 
sm{Kutb2) cos{Kutb2 


\ 


sm 


{Kutbnb) COs{Kutbnb) y 


■x(2K+l) 


u = argmin ^ min (rrib - Xfe(a;)/3J'^Sj - Xb{uj)(3b) 


6=1 


The problem is now identical to weighted least squares so the jSb which minimizes the 
expression is 

Moj) = 

Dehne 

RSSbioo) = (m, - Xb{uj)^b{oo)f'^b\^b - Xb{uj)^b{oo)) (2) 

and we have 


B 


ui = argmin 




b[UJ 


(3) 


6=1 
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One can reconstruct maximum likelihood estimators for the original parameterization from 
the /3b(0). From a computational perspective, the ML estimator requires performing B 
weighted least squares estimates at each frequency. 

Rather than obtain a single period estimate, it may be useful at any proposed uj to 
have a measure of the conhdence that uj is the true frequency. Periodograms are functions 
which map frequencies to some measure of conhdence. Often periodograms are constructed 
so that under the null hypothesis of no magnitude variation (i.e., mbi = Pbo + Cbi), the 
periodogram has a known distribution at any particular frequency. We construct such a 
periodogram for the model specihed by Equation (1). The frequency which maximizes this 
periodogram will be shown to be the maximum likelihood estimator in Equation (3). This 
periodogram is a direct generalization of the AoV periodogram of Schwarzenberg-Czerny 
(1996) to multiband data because: 

• With a single band, the periodogram simplihes to the AoV periodogram of 
Schwarzenberg-Czerny (1996). 

• The periodogram retains the F distribution under the null hypothesis of constant 
magnitude in every band. 

We now discuss how to construct the periodogram following the notation of Section 
3.1. We then go into further detail regarding the equivalence of this periodogram to 
Schwarzenberg-Czerny (1996), and compare this periodogram to VanderPlas &: Ivezic 
(2015). 

Under the notation of the previous section the model in Equation (1) can be written as 


mb = Xb(a;)/3b -h 


(4) 


where Sb ~ -^(0, Sb) for all b. Consider testing the null hypothesis 


hfo : mb = l/Sbo + ef, V &. 


This hypothesis states that the magnitude is a constant (3bo in each band. Since the hrst 
column of Xb(a;) is 1, this is a submodel of Equation (4). The weighted least squares 
estimator for the submodel is 

^ rib 






^pnb -2 


bi i—i ^bi 


The residual sum of squares is 


= (mb - IhoY S-'(mb - lAo) 

Standard results in statistics (for example Sections 2.5 and 2.6 of Scheffe 1959) show that 
under the null hypothesis 

RSSl-RSSb{u:)r^xlK 

RSSb{uj) ~ Xnb-2K-1 

where x] refers to a chi-squared distribution with j degrees of freedom. Further these two 
quantities are independent. Since the sum of random variables is we have 

B B 

Y, nss! - Y, RSSiiw) ~ xIkb 


6=1 


6=1 


B 


Y,RSSb{u:)r^x\^B_^ 


. rib—2KB—B 


6=1 


Finally we dehne the periodogram at frequency u to be the ratio of these quantities divided 
by their respective degrees of freedom 

(Ef.i m - -iKB - B) (Eti RSSS - Ef.! RSStH) 

©(cu) =-- - (5) 

2KB{J2^^^RSSb{u)) 

Under Hq, ©(cn) ~ F^rby,^ ^nb- 2 KB-B- comments on the periodogram: 
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• In practice, the frequency which maximizes the periodogram is often used as a 
period estimate. The frequency which maximizes the periodogram will minimize 
Ylb=i RSSb{u}), which we showed in Equation (3) is the maximum likelihood estimator. 


• With a single band the periodogram becomes 


©M 


{n-2K- 1) {RSS^ - RSS{u)) 
2K{RSS{u)) 


which matches the periodogram of Schwarzenberg-Czerny (1996) Equation 11 
(although with different notation). 


In addition to developing the single band AoV algorithm, Schwarzenberg-Czerny 
(1996) also developed a fast routine for evaluating RSSh{uj) based on Ending orthogonal 
polynomials on the unit circle. For this reason, we use a Fortran implementation of the 
single band AoV algorithm^ as the basis for constructing our multiband periodogram. A 
small python code demonstrating how to do this been made available^. As input parameters 
to the single band AoV algorithm used in this work, we use a minimum frequency of 1 
day“^, an upper frequency of 5 day“^, a frequency step of 0.0001 day“^ and one harmonic 
(corresponding to FR0=1, FRU=5, FRS=0.0001, and NH2=2 in the AoV code). 

This periodogram is difierent from that of VanderPlas & Ivezic (2015) (see Equation 
22 in their paper). They use RSS^ in the denominator instead of RSSb{u)). The 
unregularized models of VanderPlas & Ivezic (2015) follow an incomplete beta distribution 
(see Schwarzenberg-Czerny (1998), Eqn. 6). It should also be noted that this multiband 
method implicitly assumes that the period of oscillation is the same for each band. If the 
period varies significantly across bands, this method will not be suitable for use. 

^http://users.camk.edu.pl/alex/ 

^https://github.com/Mondrik/Multiband_AoV_Demo 




Figure 1 compares the multiband periodogram with its single band components. 


3.2. Testing the Algorithm 

To test the algorithm, we downsample the number of observations per band for 
each light curve using both simultaneous and non-simultaneous downsampling. For 
non-simultaneous downsampling, observations (consisting of a time of observation, band, 
magnitude, and photometric error) are randomly selected from all available observations. 
Observations are selected until all bands have riobs observations. For simultaneous 
downsampling, an observation in one band is chosen. We then choose observations in the 
other bands such that the absolute difference in observation times is < 0.005 days (7.2 
minutes) from the initial observation time. This is repeated Uobs times. Since the time 
for SDSS to complete one pass through all hlters is about 5.7 minutes (York et ah 2000), 
these observations are as close in phase space as possible. We also choose to use a flat time 
difference rather than a fraction of the known period in order to mimic the lack of a priori 
knowledge of the variable object, as is the case in survey planning. We then dehne a period 
as correctly recovered if \PAig — PTrue\ < 0.001 days, where Paiq is the period corresponding 
to the largest value of the multiband or single band AoV periodogram, and Prme is the 
period as measured by Sesar et ah (2010). The results are shown in Figure 2. Error bars are 
estimated by assuming a binomial distribution at each riobs characterized by the estimated 
completeness and number of objects. 
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4. Discussion 

4.1. Benefit of Multiband over Single Band 

The most striking result shown in Figure 2 is the large separation between the 
multiband non-simultaneous completeness fraction and the single band non-simultaneous 
completeness fraction; even the use of as few as 3 bands can signihcantly improve recovery 
rates over single band methods. For surveys with a low number of observations per band 
ij^obs ^ 30), such as the Dark Energy Survey (The Dark Energy Survey Collaboration 2005), 
multiband methods can provide a signihcant increase in fraction of correctly recovered 
periods, allowing for more accurate classihcation of transient objects in the survey. 


4.2. Impact of Inter-band Observing Cadence 

The second major result noticeable in Figure 2 is the difference between the 
simultaneous and non-simultaneous downsampling groups. It should be noted that in the 
single band case, simultaneous and non-simultaneous downsampling should have no effect, 
so the scatter between the two is indicative of the randomness in choosing the observations. 
The difference between multiband non-simultaneous and simultaneous downsampling arises 
primarily from the increase in phase space coverage of the non-simultaneous downsampling 
relative to the simultaneous downsampling. In the case of simultaneous downsampling, 
the additional bands add little new information about the light curve not contained in 
other bands, leading to poorer performance, despite having the same number of total 
observations. 

Figure 3 demonstrates the failure modes of our multiband model in the case of 
non-simultaneous and simultaneous observations. We plot the best £t period from the 
AoV multiband model with 15 observations per band against the period taken from Sesar 
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et al. (2010). In the non-simultaneous case, the multiband method fails primarily along 
beat periods, given by = P/(l + nP) for integer n, as discussed in VanderPlas & Ivezic 
(2015). In the simultaneous case, the multiband method tends to fail in a much more 
random fashion. 


4.3. Constructing Data Sets for Maximized Period Recovery Rates 

The improvement of recovery rates with unique phase space observations (non- 
simultaneous observations) suggests that recovery rates cannot be signihcantly improved by 
downsampling 25 simultaneous observations to, for example, sets of 12 observations that 
are less simultaneous. Fundamentally, there are only 25 unique phase space observations 
of the object, which constrains the maximum amount of phase space separation between 
the bands. In order to construct a truly non-simultaneous dataset, we require a minimum 
of Uobs per band times the number of bands. Hence our non-simultaneous downsampling 
set is not truly non-simultaneous, since we are randomly downsampling from less than 125 
observations across 5 bands (at 25 observations per band). Since the typical number of 
observations per band is ~ 55, it is impossible for us to separate the observations completely 
for Hobs > 11 in 5 bands. In this case, the algorithm is limited by the construction of the data 
set, i.e., how non-simultaneous the observations are. It would therefore be advantageous to 
construct a data set that is as non-simultaneous as possible, rather than downsample from 
a simultaneous data set, in order to use the maximum number of observations. 


4.4. Implications for Future Imaging Surveys 

This method of constructing a data set has a major potential impact on observational 
cadence planning for upcoming wide-held imaging surveys such as LSST. By varying 
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observation times between bands, accurate periods for variable sources can be estimated 
much sooner than otherwise possible. The phase space effect also has implications for our 
ability to extract transient data from ongoing surveys such as the Dark Energy Survey, 
which could see a boost in period recovery rates by employing an algorithm similar to our 
proposed method. 


5. Comparison with VanderPlas &: Ivezic (2015) 

As we mentioned earlier, another method similar to ours in spirit is that of VanderPlas 
& Ivezic (2015). Both methods use truncated Fourier series to model given observations 
across an arbitrary number of bands. However, the method used in VanderPlas & Ivezic 
(2015) is effectively an extension of the Lomb-Scargle periodogram (Scargle 1982; Lomb 
1976), while ours extends the Analysis of Variance periodogram. In the single band 
case, Graham et ah (2013) show that the multiharmonic AoV algorithm (AOVMHW 
in their notation, AoV in ours) tends to be among the top performers in any test of 
period estimation. Schwarzenberg-Czerny (1998) asserts that statistically, the use of PDM 
(Jurkevich 1971; Stellingwerf 1978), AoV, and statistics is largely a matter of taste, 
although it will be interesting to perform another analysis similar to that of Graham et ah 
(2013) on multiband data using methods such as ours and that of VanderPlas & Ivezic 
(2015) to determine when each algorithm is most effective. 


6. Conclusion 

We have introduced a new method of estimating periods of periodic variables using 
multiband imaging data. We extended the existing AoV period estimation algorithm to 
incorporate data from multiple bands while maintaining the fundamental characteristics 
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of the single band algorithm. This allows for the use of relatively few observations 
(~ 25) per band across several bands while maintaining a reasonable level of completeness 
70 — 80%). We have also shown the importance of the (non-)simultaneity of observation 
timing. For a hxed number of observations per band, non-simultaneous observations offer 
better opportunity for period recovery than simultaneous observations. This effect of 
observational simultaneity has implications for the area of survey planning, particularly in 
the early period of surveys such as LSST, when the volume of data is not enough to render 
multiband period estimation redundant. It also has implications for non-transient surveys 
imaging helds at multiple epochs. By carefully choosing the observation time and band, 
our proposed mutiband algorithm can extract periods from data previously considered too 
poorly sampled to be of use. 
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Fig. 1.— The single band AoV periodograms and multiband periodogram constructed using 
Equation (5). The AoV statistic is an indication of how well the trial function of the AoV 
algorithm hts the light curve folded with period P, with higher values indicating a better 
ht. The multiband periodogram is given by the blue dashed line, while the ugriz single 
band periodograms are given by the solid black lines. The periodogram was generated using 
non-simultaneous downsampling (to 19 observations per band in 5 bands) of an RR Lyrae 
from Sesar et ah (2010). 
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Fig. 2.— Completeness fraction (number of periods correctly recovered divided by number 
of light curves attempted, period is successfully recovered if \PAig — PTrue\ < 0.001 days.) 
as a function of observations per band across the SDSS ugriz bands. The 483 RR Lyrae of 
Sesar et al. (2010) were used as a dataset, as described in section 2. Single band periods 
and one multiband period were estimated for each object. The single band completeness is 
obtained by dividing the number of correct single band period identideations by the total 
number of single band light curves attempted (i.e., 5x, 3x, or lx the number of RR Lyrae), 
while multiband completeness is calculated by dividing the number of multiband correct 
period identiheations by the number of RR Lyrae attempted. The solid (dashed) black 
line with circles represent the completeness fraction for 5 band (ugriz) non-simultaneous 
(simultaneous) downsampling. The green solid (dashed) lines and star markers is the same, 
but for 3 bands (gri) of data. The single band counterparts for non-simultaneous and 
simultaneous downsampling are given by the solid red line and dashed blue line with triangles, 
respectively. For simultaneous observations, the typical separation between observations is 
< 0.005 days. 
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Table 1. Number of RR Lyrae in sample 


nobs per band 

Non-simultaneous 5 band 

Simultaneous 5 band 

Non-simultaneous 3 band 

Simultaneous 3 band 

10 

450 

450 

450 

450 

13 

450 

449 

450 

450 

15 

450 

448 

450 

449 

17 

448 

447 

448 

448 

19 

448 

445 

448 

447 

21 

447 

445 

447 

445 

23 

445 

440 

445 

443 

25 

444 

438 

444 

441 




Fig. 3.— Comparison of best fit periods from this work to Sesar et al. (2010) for the multi¬ 
band method in the non-simultaneous (left) and simultaneous (right) cases. Periodograms 
were made using 15 observations per band. The solid line represents a 1;1 match, while the 
dashed lines represent beat frequencies. In general, the non-simultaneous data set fails along 
beat frequencies, while the simultaneous data set fails in a more random manner. 










